REPORT RESUMES 

ED 010 599

A STUDY OF AN EXPLORATORY TECHNIQUE USED IN EDUCATIONAL 
RESEARCH. 

BY- DARROCH, JOHN N. 

MICHIGAN UNIV., ANN ARBOR

REPORT NUMBER CRP-S-431 PUB DATE 66 

REPORT NUMBER BR-5-D37B 

EDRS PRICE MF-$0.09 HC-$1.08 27P.

DESCRIPTORS- *EDUCATIONAL RESEARCH, MATHEMATICAL CONCEPTS,
*MATHEMATICAL APPLICATIONS, *MATHEMATICAL MODELS, *FACTOR
ANALYSIS, *CALCULATION, MEASUREMENT TECHNIQUES, ANN ARBOR,
MICHIGAN

TO DEVELOP A BETTER MATHEMATICAL BASIS FOR FACTOR
ANALYSIS, A MATHEMATICAL APPROACH WAS FORMULATED FOR FIXING
COMMUNALITIES AND THEIR COMPLEMENTS, UNIQUENESSES, AS WELL.
WHILE THE INVESTIGATOR WAS UNABLE TO PROVE THAT MINIMUM-TRACE
CRITERIA FIX COMMUNALITIES UNIQUELY, A SET OF IDENTITIES AND
INEQUALITIES WAS LOCATED AMONG THE COMMUNALITIES AND,
EQUIVALENTLY, AMONG THE UNIQUENESSES. PROOF AND ADDITIONAL
RESULTS CONCERNING THESE IDENTITIES AND INEQUALITIES APPEARED
IN AN APPENDIX. THESE IDENTITIES AND INEQUALITIES WERE OF
DIRECT SIGNIFICANCE SINCE THEY AFFORDED A METHOD OF OBTAINING
AN APPROXIMATE SOLUTION FOR THE MINIMUM-TRACE COMMUNALITIES.

NO TIME WAS AVAILABLE DURING THE PERIOD OF THE CONTRACT TO 
PERFORM THE NUMERICAL CALCULATIONS NECESSARY TO VERIFY THE 
FINDINGS. (GD) 









A Study of an Exploratory Technique 
Used in Educational Research 



Cooperative Research Project No. S-431



John N. Darroch



University of Michigan 
Ann Arbor, Michigan 

1966 






The research reported herein was supported by 
the Cooperative Research Program of the Office of 
Education, U. S. Department of Health, Education 
and Welfare. 



Department of Health, Education and Welfare
Office of Education

This document has been reproduced exactly as received from the
person or organization originating it. Points of view or opinions
stated do not necessarily represent official Office of Education
position or policy.









Table of Contents

Problem
Objectives
Related research
Findings
References
Conclusions

Appendix A: Proposal

Appendix B: Some further inequalities and an
identity in factor analysis


PROBLEM.

Let $\mathbf{x} = [x_1, x_2, \dots, x_p]'$ be a vector of standardised (mean
zero, variance one) random variables. The factor analysis model
for $\mathbf{x}$ is

$$\mathbf{x} = A\mathbf{f} + \mathbf{z}$$

where $\mathbf{f} = [f_1, \dots, f_q]'$, $q < p$, $\mathbf{z} = [z_1, \dots, z_p]'$,
$A$ is $p \times q$, $E(\mathbf{f}) = \mathbf{0}$, $E(\mathbf{z}) = \mathbf{0}$,
$E[\mathbf{f}\mathbf{z}'] = 0$, $E[\mathbf{f}\mathbf{f}'] = I$ and
$E[\mathbf{z}\mathbf{z}'] = \Delta$, diagonal.
Thus, if $\Sigma$ denotes the covariance matrix of $\mathbf{x}$,

$$\Sigma = AA' + \Delta.$$

The communalities $h_1^2, h_2^2, \dots, h_p^2$ of $x_1, x_2, \dots, x_p$ are the
diagonal elements of $AA'$ or, in other words, the fractions of the
variances of $x_1, x_2, \dots, x_p$ attributable to the common factors
$f_1, f_2, \dots, f_q$. The uniquenesses $\delta_1^2, \delta_2^2, \dots, \delta_p^2$ of
$x_1, x_2, \dots, x_p$ are the diagonal elements of the diagonal matrix
$\Delta$ or, in other words, the fractions of the variances attributable
to the "specific variables" $z_1, z_2, \dots, z_p$.

The problem concerns the choice of the communalities when
$\Sigma$ is given.
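
The decomposition above can be illustrated with a small sketch. The loading matrix `A` below is hypothetical (it does not come from the report), and the helper names `communalities` and `covariance` are introduced only for this illustration. The sketch builds $\Sigma = AA' + \Delta$ for standardised variables, so that each uniqueness is one minus the corresponding communality.

```python
# Hypothetical loadings for p = 4 variables on q = 2 common factors
# (illustrative numbers only, not taken from the report).
A = [
    [0.8, 0.1],
    [0.7, 0.2],
    [0.2, 0.6],
    [0.1, 0.7],
]


def communalities(loadings):
    """Diagonal of A A': share of each unit variance due to the common factors."""
    return [sum(a * a for a in row) for row in loadings]


def covariance(loadings):
    """Sigma = A A' + Delta, with Delta chosen so that every variance equals one."""
    h2 = communalities(loadings)
    p = len(loadings)
    sigma = [[sum(x * y for x, y in zip(loadings[i], loadings[j]))
              for j in range(p)] for i in range(p)]
    for i in range(p):
        sigma[i][i] += 1.0 - h2[i]  # add the uniqueness delta_i^2
    return sigma


h2 = communalities(A)              # communalities h_i^2
delta2 = [1.0 - h for h in h2]     # uniquenesses delta_i^2
Sigma = covariance(A)
```

In practice only $\Sigma$ is observed; the report's problem is the reverse direction, namely recovering the diagonal split between `h2` and `delta2` from `Sigma` alone.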

OBJECTIVES.

The general objective was to study the consequences of the
minimum-trace criterion for communalities. This says: choose
$h_1^2, h_2^2, \dots, h_p^2$ such that their sum is minimised.

The particular objectives were as follows. 

(i) To prove or disprove that this criterion fixes the communalities uniquely. 

(ii) To investigate the relationship, if any, between the minimum-trace 
criterion and the minimum-rank criterion which says: choose the 
communalities so that q is as small as possible. 

(iii) To solve for the minimum-trace communalities. 

RELATED RESEARCH. 

Darroch (1965) gives inequalities which the communalities must 
satisfy regardless of which criterion is used to select particular values. 

These inequalities are relevant to objectives (i) and (iii). 

After writing the proposal, it was discovered that Ledermann (1939)
had investigated our objective (ii). He constructs a numerical example
of a 5x5 covariance matrix for which the minimum rank, q, is
2 but such that the minimum trace is attained when q > 2. This
demonstrates that there is no strict relationship between the two
criteria and, in particular, that neither implies the other. Ledermann
noted that Thompson (1938) was the first person to propose the
minimum-trace criterion. Thus, in the proposal for the present research,
the description of this criterion as "new" was ill founded.

References 

Darroch, J. N., A set of inequalities in factor analysis, Psychometrika,
1965, 30, 449-453.

Ledermann, Walter, On a problem concerning matrices with variable
diagonal elements, Proc. Roy. Soc. Edinburgh, 1939, 60, 1-17.

Thompson, G. H., Maximising the specific factors in the analysis of
ability, Brit. Journ. Educ. Psych., 1938, 8, 255-264.

FINDINGS. 

Much effort was spent on objective (i) but, apart from the unimportant
case when p = 2, we were unable to prove that the minimum-trace
criterion fixes the communalities uniquely.

As pointed out in the previous section, objective (ii) was effectively
answered by Ledermann twenty-seven years ago.

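The principal findings concern a set of identities and inequalities
among the communalities or, equivalently, among the uniquenesses
$\delta_1^2, \delta_2^2, \dots, \delta_p^2$. These are p identities
$E_1, E_2, \dots, E_p$ of which the first is

$$E_1: \quad 1 - \rho_1^2 = \delta_1^2 + \sum_{r=1}^{\infty} \beta_1' \Delta_{11} (\Sigma_{11}^{-1}\Delta_{11})^{r-1} \beta_1$$

where

$$\Sigma = \begin{bmatrix} 1 & \sigma_1' \\ \sigma_1 & \Sigma_{11} \end{bmatrix}, \quad
\Delta = \begin{bmatrix} \delta_1^2 & 0' \\ 0 & \Delta_{11} \end{bmatrix}, \quad
\beta_1 = \Sigma_{11}^{-1}\sigma_1, \quad
\rho_1^2 = \sigma_1' \Sigma_{11}^{-1} \sigma_1 .$$

$E_1$ implies the inequalities

$$I_1(n): \quad 1 - \rho_1^2 \ge \delta_1^2 + \sum_{r=1}^{n} \beta_1' \Delta_{11} (\Sigma_{11}^{-1}\Delta_{11})^{r-1} \beta_1 , \qquad n = 1, 2, 3, \dots$$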
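
The identity and inequalities for the first variable can be checked numerically. The following is a minimal sketch under stated assumptions: the one-factor loadings `a` are hypothetical, the helper `solve` is a plain Gaussian elimination written for this illustration, and the sum is indexed from r = 1 so that the n = 1 inequality is the linear one. It accumulates the partial sums of $\beta_1'\Delta_{11}(\Sigma_{11}^{-1}\Delta_{11})^{r-1}\beta_1$ and compares them with $1 - \rho_1^2 - \delta_1^2$.

```python
def solve(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(M[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


# Hypothetical one-factor model with p = 4 (illustrative loadings only).
a = [0.8, 0.7, 0.6, 0.5]
p = len(a)
delta2 = [1.0 - ai * ai for ai in a]          # true uniquenesses
Sigma = [[a[i] * a[j] + (delta2[i] if i == j else 0.0) for j in range(p)]
         for i in range(p)]

sigma1 = Sigma[0][1:]                         # covariances of x_1 with the rest
Sigma11 = [row[1:] for row in Sigma[1:]]      # covariance matrix of the rest
Delta11 = delta2[1:]                          # diagonal of Delta_11

beta1 = solve(Sigma11, sigma1)                # regression coefficients
rho2 = sum(s * b for s, b in zip(sigma1, beta1))
gap = 1.0 - rho2 - delta2[0]                  # quantity bounded below by each I_1(n)

partial_sums = []
v = beta1[:]                                  # v = (Sigma11^{-1} Delta11)^{r-1} beta1
total = 0.0
for r in range(1, 41):
    dv = [d * x for d, x in zip(Delta11, v)]  # Delta11 v
    total += sum(b * y for b, y in zip(beta1, dv))
    partial_sums.append(total)                # right side of I_1(r) minus delta_1^2
    v = solve(Sigma11, dv)
```

For a model that satisfies the factor-analysis assumptions exactly, the partial sums should increase towards `gap`, so each $I_1(n)$ holds and the limit reproduces $E_1$.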



The proofs and additional results concerning these identities and
inequalities have been written up in a form suitable for publication
in Appendix B.

The identities and inequalities are of general significance for
factor analysis since they show exactly what lies behind the inequalities

$$1 - \rho_i^2 \ge \delta_i^2 , \qquad i = 1, 2, \dots, p,$$

first proved by Roff (1936). They are of direct significance for
objective (iii) of this project since they afford a method of obtaining
an approximate solution for the minimum-trace communalities, as
follows. Let R(n) denote the region of points
$(\delta_1^2, \delta_2^2, \dots, \delta_p^2)$ satisfying both

$$\delta_i^2 \ge 0 , \qquad i = 1, 2, \dots, p,$$

and the inequalities

$$I_i(n) , \qquad i = 1, 2, \dots, p.$$

The maximisation of $\delta_1^2 + \delta_2^2 + \dots + \delta_p^2$ (equivalently the
minimisation of $h_1^2 + h_2^2 + \dots + h_p^2$) in R(n) is a problem in
nonlinear programming (linear if n = 1) and can be investigated by
standard methods. See for instance Graves and Wolfe (1963). It seems
clear that the accuracy of this solution increases with n and that,
as $n \to \infty$, the approximate solution converges to the exact solution.
However, no time was available during the period of the contract to
perform the numerical calculations which are necessary to verify the
above statements.
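
For the linear case n = 1 the programme can be written out explicitly. The display below is a sketch rather than a result stated in the report; the symbol $\beta_{ij}$ is introduced here as an assumption and denotes the coefficient of $x_j$ in the regression of $x_i$ on the other p - 1 variables, so that the quadratic form in $I_i(1)$ reduces to a weighted sum of the remaining uniquenesses.

```latex
\begin{aligned}
\text{maximise}\quad   & \sum_{i=1}^{p} \delta_i^2 \\
\text{subject to}\quad & \delta_i^2 + \sum_{j \ne i} \beta_{ij}^2\, \delta_j^2 \;\le\; 1 - \rho_i^2,
                         \qquad i = 1, 2, \dots, p, \\
                       & \delta_i^2 \;\ge\; 0, \qquad i = 1, 2, \dots, p.
\end{aligned}
```

Both the objective and the constraints are linear in the unknowns $\delta_1^2, \dots, \delta_p^2$, and the coefficients $\beta_{ij}$ and $\rho_i^2$ depend only on the given matrix $\Sigma$.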



References 

Graves, Robert L. and Wolfe, Philip, Recent advances in mathematical
programming. McGraw-Hill, 1963.

Roff, M., Some properties of the communality in multiple factor
theory, Psychometrika, 1936, 1, 1-6.

CONCLUSIONS.

More work is necessary to answer the questions: are the
minimum-trace communalities unique and how can they be solved
most efficiently? In this work the identities and inequalities given
in Appendix B should prove invaluable.

Appendix A: Proposal

1. ABSTRACT

(a) The objectives are to study the consequences of a new criterion for
communalities. In particular (i) to discover whether the criterion defines
communalities uniquely in all cases; (ii) to investigate the connection with
the usual, but unsatisfactory, minimum-rank definition; (iii) to obtain an
accurate and speedy method for solving for the communalities numerically.

(b) The procedure for objectives (i) and (ii) will be mathematical and,
for (iii), will be partly mathematical and partly computational.

2. PROBLEM

Factor Analysis is an exploratory technique used in Educational Research
and in many other important fields. A factor analysis of a correlation matrix
of a set of variables $x_1, x_2, \dots, x_p$ attempts to explain the correlation of
the variables by their dependence on a set of common factors $f_1, f_2, \dots, f_q$.
It is similar to a regression analysis but with the important difference
that the independent variables $f_1, f_2, \dots, f_q$ are not observed and are sought
for beneath the surface of the observations.

Thus, in a recent paper in the Journal of Educational Psychology
(1963), Wallen, Travers, Reid and Wodtke are able to represent twenty
variables describing teacher behavior in terms of five underlying factors which
they label: cold, controlling versus warm, permissive; vigorous, dynamic
versus dull, quiet; insecure, anxious versus confident; spends much time
alone versus little time alone; much academic activity versus little academic
emphasis.

Factor Analysis is more ambitious than most other types of statistical
analysis and, partly because of this, there are still some basic ambiguities
remaining despite all of the attention it has received since its inception
by Charles Spearman in 1904. One of these ambiguities concerns the
determination of the communalities $h_1^2, h_2^2, \dots, h_p^2$. The communality
$h_i^2$ of the variable $x_i$ is the fraction of its variance attributable to
the common factors $f_1, f_2, \dots, f_q$. No completely satisfactory criterion
for determining the communalities has yet been offered and this is the basic
problem which I wish to tackle. It might seem surprising that factor analysis
has been used so much when there is this fundamental weakness in the theory.
However, there is much numerical evidence to suggest that the conclusions of
a factor analysis are not greatly altered by changing the communalities and
it is this fact which permits many users of factor analysis to forget that
the theory is intrinsically unsound.

I wish to propose a new criterion for communalities which I believe
will provide an acceptable solution to the problem.

3. LITERATURE

Factor Analysis is used in Educational Research so extensively that it
would be impertinent to try to summarize the relevant literature. Practically
every issue of the Journal of Educational Psychology and many issues
of "Educational and Psychological Measurement" and the Journal of Educational
Research report at least one piece of research in which Factor Analysis was
a vital tool.

Turning to the general theory, the guiding principle is that the factor
model should be parsimonious in the sense that q should be as small as
possible. (The question of whether Nature operates in this parsimonious way
is an interesting one but one which cannot usually be answered.) If $\Gamma$ denotes
the modified correlation matrix in which the diagonal elements of one are
replaced by the communalities, then q is the rank of $\Gamma$. Therefore, by
appealing to the "principle of parsimony," we can say that the rank of $\Gamma$
should be minimized with respect to $h_1^2, h_2^2, \dots, h_p^2$ subject to the
condition that $\Gamma$ is kept non-negative definite (i.e. representable as a
covariance matrix). In some cases this minimum rank condition fixes
$h_1^2, h_2^2, \dots, h_p^2$ uniquely but in the majority of cases it does not;
Anderson and Rubin (1956) is the most important reference here. Moreover,
even when the condition of minimum rank does lead to unique values of
$h_1^2, h_2^2, \dots, h_p^2$, there is no direct method of solving for these values.

Because of these difficulties, other criteria for communalities have
been offered. Guttman (1956) proposed that the "best possible" estimate of
$h_i^2$ is $\rho_i^2$ where $\rho_i$ is the multiple correlation coefficient of $x_i$ with the
remaining p-1 variables. However, this leads to a $\Gamma$ which is illegitimate
in the sense of not being non-negative definite. To meet this drawback
Joreskog (1965) has proposed that $1 - h_i^2 = \theta (1 - \rho_i^2)$ where $\theta$ is the
largest number which leaves $\Gamma$ non-negative definite. Both of these proposals
are manageable computationally and their intuitive content derives from two
sources. Firstly, the inequalities $h_i^2 \ge \rho_i^2$ which were first pointed out
by Roff (1936). Secondly, Guttman's theorem (1956) that, under very reasonable
conditions, $h_i$ can be viewed as the multiple correlation coefficient of
$x_i$ with an infinite set of other relevant variables. Some tighter
inequalities than the above are presented in the attached note.
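
The lower bound $h_i^2 \ge \rho_i^2$ can be illustrated numerically. The sketch below is not part of the proposal: the one-factor loadings `a` are hypothetical, `solve` and `smc` are helper names introduced here, and the squared multiple correlations are obtained from the standard identity $\rho_i^2 = 1 - 1/(\Sigma^{-1})_{ii}$, which holds when every variable has unit variance.

```python
def solve(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(M[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def smc(Sigma):
    """Squared multiple correlation of each variable with all the others."""
    p = len(Sigma)
    out = []
    for i in range(p):
        unit = [1.0 if k == i else 0.0 for k in range(p)]
        col = solve(Sigma, unit)          # i-th column of the inverse of Sigma
        out.append(1.0 - 1.0 / col[i])
    return out


# Hypothetical one-factor model with p = 4 (illustrative loadings only).
a = [0.8, 0.7, 0.6, 0.5]
p = len(a)
h2 = [ai * ai for ai in a]                # true communalities
Sigma = [[a[i] * a[j] + ((1.0 - h2[i]) if i == j else 0.0) for j in range(p)]
         for i in range(p)]

rho2 = smc(Sigma)                         # lower bounds for the communalities
```

Comparing `rho2` with `h2` entry by entry shows how far the squared multiple correlations fall below the communalities of the model that generated `Sigma`.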

Likelihood-ratio hypothesis-testing provides an approximate decomposition
of the sample correlation matrix which parallels the exact decomposition
of the population correlation matrix into $\Gamma$ plus the diagonal matrix of
"uniquenesses" $1 - h_i^2$. But the fact that the likelihood-ratio decomposition
is approximate and assumes a normal distribution for the observations means
that it is not strictly relevant to the present discussion.

A closely related alternative to Factor Analysis is Image Analysis,
proposed by Guttman (1953). Harris (1962) has shown that Image Analysis is
closely related to a factor analysis which uses Guttman's best possible
communalities. In the same paper Harris also relates Guttman's work to some
proposals of Rao (1955) for deriving communalities iteratively once q has
been decided upon.

A good overall discussion of the communality problem is to be found in
Chapter 5 of the book by Harman (1960).

4. OBJECTIVES

I propose the following criterion for communalities: that they be chosen
to minimise their sum $h_1^2 + h_2^2 + \dots + h_p^2 = \operatorname{trace} \Gamma$, subject to $\Gamma$ being
positive semi-definite. This definition can be viewed as an interpretation of
the principle of parsimony and it dovetails closely with the principal
components analysis of $\Gamma$; a paper on this aspect of the minimum-trace
criterion is under preparation.

My future objectives are to investigate the consequences of this criterion
under the following heads.

(i) Uniqueness of $h_1^2, h_2^2, \dots, h_p^2$. I hope to prove that the
communalities are fixed uniquely.

(ii) Relationship between minimum rank and minimum trace. Preliminary
investigation indicates a close relationship and it may even be true that
minimum trace implies minimum rank.

(iii) Methods of solution for $h_1^2, h_2^2, \dots, h_p^2$.

5. PROCEDURES

(a) The essential problem involved in 4 (iii) is to provide a good
approximation to the smallest root of a positive definite matrix. I have derived an
upper bound for this root which is intimately related to the inequalities proved
in the attached note; also a lower bound. Apart from possible theoretical
refinement of these bounds the main task will be to investigate numerically the
accuracy of the communality solutions derived from them. This numerical work
will require some time on The University of Michigan IBM 7090 computer.

(b), (c), (d). Not applicable.

(e) Not possible to give any reliable estimates.

6. PERSONNEL

The principal investigator meets regularly with the mathematical
psychologists at The University of Michigan and shall address their weekly
seminar in the spring semester. Professor Paul Dwyer of the Department
of Mathematics and the Statistics Laboratory stated that he wishes to
cooperate on the computational aspects of this project. Among other things
Professor Dwyer is a foremost worker in computational statistics and was
responsible for some of the early mathematical development of Factor
Analysis.

Biographical data concerning the principal investigator follows.

Appendix B: Some further inequalities and an identity in factor analysis

1. Introduction and Summary

As in [1] let $\Sigma$ denote the (non-singular) covariance matrix
of $\mathbf{x} = [x_1\ x_2\ \dots\ x_p]'$ where $x_i$ has mean zero and variance one,
$1 \le i \le p$. The first basic requirement of a factor-analysis model for
$\mathbf{x}$ is that $\mathbf{x}$ can be expressed as

$$\mathbf{x} = \mathbf{y} + \mathbf{z},$$

where $\mathbf{y} = [y_1\ y_2\ \dots\ y_p]'$, $\mathbf{z} = [z_1\ z_2\ \dots\ z_p]'$,
$E(\mathbf{y}) = E(\mathbf{z}) = \mathbf{0}$ and

(1) $E[\mathbf{y}\mathbf{z}'] = 0$, $E[\mathbf{z}\mathbf{z}'] = \Delta$, diagonal.

Implicit in (1) is the condition that the matrix $\Sigma - \Delta$ is non-negative
definite since it can be written as $E[\mathbf{y}\mathbf{y}']$.

Write

$$\mathbf{x}_1 = [x_2\ x_3\ \dots\ x_p]', \quad \mathbf{y}_1 = [y_2\ y_3\ \dots\ y_p]', \quad \mathbf{z}_1 = [z_2\ z_3\ \dots\ z_p]',$$

$$\Sigma_{11} = E[\mathbf{x}_1\mathbf{x}_1'], \quad \Delta_{11} = E[\mathbf{z}_1\mathbf{z}_1'], \quad \sigma_1 = E[x_1\mathbf{x}_1],$$

$$\beta_1 = \Sigma_{11}^{-1}\sigma_1, \quad \rho_1^2 = \sigma_1'\Sigma_{11}^{-1}\sigma_1 = \beta_1'\Sigma_{11}\beta_1.$$

The vector $\beta_1$ is the vector of regression coefficients of $x_1$ on
$\mathbf{x}_1$ and $\rho_1$ is the multiple correlation coefficient of $x_1$ with $\mathbf{x}_1$.

In [1] we derived inequalities $I_1, I_2, \dots, I_p$, say, where

$$I_1: \quad 1 - \rho_1^2 \ge \delta_1^2 + \beta_1'\Delta_{11}\beta_1 .$$

$\delta_1^2$ is the (1, 1) element of $\Delta$, the "uniqueness" of $x_1$. $I_1$ is an
improvement over

$$J_1: \quad 1 - \rho_1^2 \ge \delta_1^2$$

which was previously derived by Roff [2]. It was shown that the
conditions, $C_1$ say, for equality in $I_1$ are the same as the conditions
for equality in $J_1$.

In section 2 of this paper the improvement from $J_1$ to $I_1$
is carried further, to $I_1(n)$ with conditions $C_1(n)$ for equality. It
is shown that

$$C_1(n) \iff C_1 , \qquad n = 1, 2, \dots$$


In section 3 the "ultimate" inequality $I_1(\infty)$, obtained by
letting $n \to \infty$ in $I_1(n)$, is considered and the condition for equality
is shown to be $C_1(\infty)$: $y_1$ is a linear combination of $y_2, y_3, \dots, y_p$.
This condition will be recognized as the second basic requirement of a
factor-analysis model. [The variables $y_1, y_2, \dots, y_p$ are viewed as
linear combinations of at most p - 1 common factors all of which must
appear with non-zero coefficients in at least one of $y_2, \dots, y_p$:
otherwise they would not be common.] Thus $I_1(\infty)$ really states an
identity, $E_1(\infty)$ say, and it is this identity which is the most
fundamental result given here, especially since all of the inequalities
$J_1, I_1, \dots, I_1(n)$ are derivable from it.

2. The Inequality $I_1(n)$ and Conditions for Equality.

Define

$$\alpha_1(n) = [I + (\Sigma_{11}^{-1}\Delta_{11}) + \dots + (\Sigma_{11}^{-1}\Delta_{11})^n]\,\beta_1 .$$

Then

(2) $$[\Sigma - \Delta]\begin{bmatrix} 1 \\ -\alpha_1(n) \end{bmatrix} =
\begin{bmatrix} 1 - \rho_1^2 - \delta_1^2 - \beta_1'\Delta_{11}[I + (\Sigma_{11}^{-1}\Delta_{11}) + \dots + (\Sigma_{11}^{-1}\Delta_{11})^{n-1}]\beta_1 \\ \Delta_{11}(\Sigma_{11}^{-1}\Delta_{11})^n\beta_1 \end{bmatrix}$$

and

(3) $$[\,1 \;\; -\alpha_1(n)'\,]\,[\Sigma - \Delta]\begin{bmatrix} 1 \\ -\alpha_1(n) \end{bmatrix} =
1 - \rho_1^2 - \delta_1^2 - \beta_1'\Delta_{11}[I + (\Sigma_{11}^{-1}\Delta_{11}) + \dots + (\Sigma_{11}^{-1}\Delta_{11})^{2n}]\beta_1 .$$

Because $\Sigma - \Delta$ is non-negative definite, it follows that the right side
of (3) is non-negative. This gives us the inequality $I_1(2n)$ where
$I_1(n)$ is defined as

$$I_1(n): \quad 1 - \rho_1^2 \ge \delta_1^2 + \beta_1'\Delta_{11}[I + (\Sigma_{11}^{-1}\Delta_{11}) + \dots + (\Sigma_{11}^{-1}\Delta_{11})^n]\beta_1 .$$

The matrix $\Sigma_{11}^{-1}$ is positive definite and therefore

$$\beta_1'\Delta_{11}(\Sigma_{11}^{-1}\Delta_{11})^n\beta_1 \ge 0 .$$

It follows that

$$I_1(n) \Rightarrow I_1(n-1) \Rightarrow \dots \Rightarrow I_1(0)$$

and that

(4) $$C_1 \iff C_1(0) \Rightarrow C_1(1) \Rightarrow \dots \Rightarrow C_1(n),$$

where $C_1(n)$ denotes the conditions for equality in $I_1(n)$, n = 0, 1, 2, ....
Now note that, if $\Gamma$ is a non-negative definite matrix,

$$\mathbf{x}'\Gamma\mathbf{x} = 0 \iff \Gamma\mathbf{x} = \mathbf{0} .$$

Equations (1) and (2) therefore show that

(5) $$C_1(2n) \Rightarrow C_1(n-1),$$

and (4) and (5) yield

$$C_1(n) \iff C_1 , \qquad n = 1, 2, \dots .$$
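
The lower block of equation (2) rests on a telescoping sum that the text does not spell out. The following worked step is a sketch of that calculation; the shorthand $M = \Sigma_{11}^{-1}\Delta_{11}$ is introduced here for brevity and is not notation from the original.

```latex
\begin{aligned}
(\Sigma_{11} - \Delta_{11})\,\alpha_1(n)
  &= \Sigma_{11}\,(I - M)\sum_{r=0}^{n} M^{r}\beta_1
     && \text{since } \Sigma_{11} - \Delta_{11} = \Sigma_{11}(I - M) \\
  &= \Sigma_{11}\,(I - M^{\,n+1})\,\beta_1
     && \text{telescoping sum} \\
  &= \sigma_1 - \Delta_{11} M^{\,n}\beta_1
     && \text{since } \Sigma_{11}\beta_1 = \sigma_1,\ \Sigma_{11} M = \Delta_{11}.
\end{aligned}
```

Hence $\sigma_1 - (\Sigma_{11} - \Delta_{11})\alpha_1(n) = \Delta_{11}(\Sigma_{11}^{-1}\Delta_{11})^n\beta_1$, which is the lower block on the right side of (2).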



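3. The Inequality $I_1(\infty)$ and the Condition for Equality.

On letting n approach $\infty$ in $I_1(n)$ we obtain

$$I_1(\infty): \quad 1 - \rho_1^2 \ge \delta_1^2 + \sum_{r=0}^{\infty} \beta_1'\Delta_{11}(\Sigma_{11}^{-1}\Delta_{11})^r\beta_1 .$$

THEOREM. The condition $C_1(\infty)$ for equality in $I_1(\infty)$ is

$C_1(\infty)$: $\Sigma - \Delta$ has the same rank as $\Sigma_{11} - \Delta_{11}$,

or, in other words,

$C_1(\infty)$: $y_1$ is a linear function of $y_2, \dots, y_p$.

Before proving this theorem we state and prove two lemmas.

LEMMA 1. Define

$$\Theta = \Sigma_{11}^{-1/2}\,\Delta_{11}\,\Sigma_{11}^{-1/2} .$$

Then, as $n \to \infty$, $\Theta^n$ converges to a matrix $\Phi$, say, where

$$\Theta\Phi = \Phi .$$

Proof. Let $\theta$ denote an eigenvalue of $\Theta$. Now $\Theta$ is non-negative
definite and therefore $\theta$ is real and $\theta \ge 0$. Also $\Sigma_{11} - \Delta_{11}$ is
non-negative definite and therefore so is $I - \Theta$ and consequently $\theta \le 1$.
Because $\theta$ is real and $|\theta| \le 1$ it follows that

$$\Theta^n \to \Phi$$

where $\Theta\Phi = \Phi$ since $\Theta\,\Theta^n = \Theta^{n+1}$.

The main purpose of lemma 2 is to establish that, while
$\sum_{r=0}^{n} \Theta^r$ need not converge as $n \to \infty$, the vector

$$\alpha_1(n) = \Sigma_{11}^{-1/2}\Big[\sum_{r=0}^{n}\Theta^r\Big]\Sigma_{11}^{-1/2}\,\sigma_1$$

does converge.

LEMMA 2. As $n \to \infty$, $\alpha_1(n) \to \alpha_1$ say, where

$$[\Sigma_{11} - \Delta_{11}]\,\alpha_1 = \sigma_1 .$$

Proof. From (1) we have

$$E[\mathbf{y}_1\,(y_1,\ \mathbf{y}_1')] = [\,\sigma_1,\ \Sigma_{11} - \Delta_{11}\,] .$$

Therefore there exists at least one solution $\gamma_1$ of

$$[\Sigma_{11} - \Delta_{11}]\,\gamma_1 = \sigma_1 .$$

Therefore

$$\begin{aligned}
\alpha_1(n) &= \Sigma_{11}^{-1/2}\Big[\sum_{r=0}^{n}\Theta^r\Big]\Sigma_{11}^{-1/2}\,[\Sigma_{11} - \Delta_{11}]\,\gamma_1 \\
&= \Sigma_{11}^{-1/2}\Big[\sum_{r=0}^{n}\Theta^r\Big][I - \Theta]\,\Sigma_{11}^{1/2}\,\gamma_1 \\
&= \Sigma_{11}^{-1/2}\,[I - \Theta^{n+1}]\,\Sigma_{11}^{1/2}\,\gamma_1 \\
&\to \Sigma_{11}^{-1/2}\,[I - \Phi]\,\Sigma_{11}^{1/2}\,\gamma_1 = \alpha_1, \text{ say.}
\end{aligned}$$

Although $\alpha_1$ is expressed as a function of the particular solution $\gamma_1$
it is of course the same for all $\gamma_1$. Finally

$$[\Sigma_{11} - \Delta_{11}]\,\alpha_1 = \Sigma_{11}^{1/2}[I - \Theta][I - \Phi]\Sigma_{11}^{1/2}\gamma_1 = [\Sigma_{11} - \Delta_{11}]\,\gamma_1 = \sigma_1$$

since $[I - \Theta]\Phi = 0$.

[Note that, if $\Sigma_{11} - \Delta_{11}$ is non-singular, then $0 \le \theta < 1$,
$\Phi = 0$ and $\alpha_1 = [\Sigma_{11} - \Delta_{11}]^{-1}\sigma_1$.]

Proof of the theorem. Consider the squared multiple-correlation
coefficient, $\rho^2(y_1 \mid \mathbf{y}_1)$, of $y_1$ with $\mathbf{y}_1$. It satisfies the equation

$$(1 - \delta_1^2)\,\rho^2(y_1 \mid \mathbf{y}_1) = \sigma_1'\gamma_1$$

where, as in lemma 2, $\gamma_1$ is any solution of

$$[\Sigma_{11} - \Delta_{11}]\,\gamma_1 = \sigma_1 .$$

In particular

$$\begin{aligned}
(1 - \delta_1^2)\,\rho^2(y_1 \mid \mathbf{y}_1) &= \sigma_1'\alpha_1 \\
&= \sigma_1'\,\Sigma_{11}^{-1/2}\Big[\sum_{r=0}^{\infty}\Theta^r\Big]\Sigma_{11}^{-1/2}\,\sigma_1 \\
&= \rho_1^2 + \sum_{r=0}^{\infty}\beta_1'\Delta_{11}(\Sigma_{11}^{-1}\Delta_{11})^r\beta_1 .
\end{aligned}$$

Thus the inequality $I_1(\infty)$ may be expressed as

$$I_1(\infty): \quad \rho^2(y_1 \mid \mathbf{y}_1) \le 1 ,$$

and there is equality if and only if $y_1$ is a linear function of $y_2, \dots, y_p$.
Since this condition is satisfied in a factor analysis model we have the
identity

$$E_1(\infty): \quad 1 - \rho_1^2 = \delta_1^2 + \sum_{r=0}^{\infty}\beta_1'\Delta_{11}(\Sigma_{11}^{-1}\Delta_{11})^r\beta_1 .$$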